Skip to content

Implement persistence layer with binary serializer and state management#11

Merged
Satlykovs merged 4 commits intomainfrom
satlykovs/feat/analyzer-persistence
Mar 22, 2026
Merged

Implement persistence layer with binary serializer and state management#11
Satlykovs merged 4 commits intomainfrom
satlykovs/feat/analyzer-persistence

Conversation

@Satlykovs
Copy link
Owner

Цель изменений

Реализация слоя персистентности (Persistence Layer) для сохранения эталонов учителей и результатов анализа учеников. Оптимизация хранения тяжелых спектральных данных.

Что сделано

Infrastructure (Persistence & Serialization):

  • Binary Storage: Реализован FeatureBinarySerializer для упаковки матриц признаков в формат bytea (PostgreSQL).
  • JPA Integration: Созданы сущности AssignmentEntity и SubmissionEntity.
  • JSONB Support: Настроен нативный маппинг Hibernate 6 для хранения детального фидбека (List<FeedbackEvent>) в формате JSONB.
  • Adapters: Реализованы порты ReferenceRepository и ResultRepository через Jpa-адаптеры.

Application (Workflow & Resilience):

  • State Machine: Внедрено управление статусами (ANALYZING, COMPLETED, FAILED). Теперь система отслеживает жизненный цикл задачи.
  • Error Handling: Реализован механизм handleFailure, гарантирующий запись текста ошибки в БД при падении пайплайна FFmpeg или экстракции.
  • UUID Mapping: Настроена сквозная связь между ключами файлов в S3 и Primary Keys в базе данных.

План на завтра(или на будущее =))

  • Добавление недостающих Javadoc.
  • Расширение Unit-тестов для адаптеров репозиториев.

@Satlykovs Satlykovs self-assigned this Mar 21, 2026
@coderabbitai
Copy link

coderabbitai bot commented Mar 21, 2026

Warning

Rate limit exceeded

@Satlykovs has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 24 minutes and 3 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7e7fef13-5c7d-4c20-a425-28ee46f0979b

📥 Commits

Reviewing files that changed from the base of the PR and between dff6548 and 814ee9e.

📒 Files selected for processing (5)
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ReferenceRepository.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ResultRepository.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/adapter/SubmissionPersistenceAdapter.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java
📝 Walkthrough

Walkthrough

Extracts a UUID from file keys, persists lifecycle statuses (ANALYZING/COMPLETED/FAILED), branches processing for references vs submissions (save features vs evaluate+save results), adds JPA entities/repositories and adapters, introduces binary feature serializer, updates tests, and improves error handling and logging.

Changes

Cohort / File(s) Summary
Use Case
backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java
Extract UUID from fileKey, set status to ANALYZING, branch on bucket into reference vs submission flows, persist features or evaluate+save results, update statuses via helper, log full stack traces and wrap exceptions, add helper methods for status and bucket-specific processing.
Domain Ports
backend/.../domain/port/ReferenceRepository.java, backend/.../domain/port/ResultRepository.java
ReferenceRepository: renamed/adjusted methods to save(UUID assignmentId, FeatureSequence), Optional<FeatureSequence> findFeaturesById(UUID), added updateStatus(UUID, AudioProcessingStatus, String). ResultRepository: added Optional<UUID> findAssignmentIdBySubmissionId(UUID) and updateStatus(UUID, AudioProcessingStatus, String).
Persistence Adapters
backend/.../infrastructure/persistence/adapter/AssignmentPersistenceAdapter.java, backend/.../infrastructure/persistence/adapter/SubmissionPersistenceAdapter.java
New Spring components implementing ports: persist/deserialize feature cache, save reference features to assignments, persist analysis results to submissions, and delegate status/error updates to JPA repos.
JPA Entities
backend/.../infrastructure/persistence/entity/AssignmentEntity.java, backend/.../infrastructure/persistence/entity/SubmissionEntity.java
New entities: AssignmentEntity (id, status, referenceSpectreCache, errorMessage) and SubmissionEntity (id, assignmentId, status, scores, errorMessage, JSON analysisFeedback).
JPA Repositories
backend/.../infrastructure/persistence/repository/JpaAssignmentRepository.java, backend/.../infrastructure/persistence/repository/JpaSubmissionRepository.java
New Spring Data interfaces with @Modifying JPQL updateStatus(UUID, AudioProcessingStatus, String) methods to atomically set status and error message.
Serialization Utility
backend/.../infrastructure/utils/FeatureBinarySerializer.java
New little-endian binary serializer/deserializer for FeatureSequence with header (frameCount, binCount, frameRate), payload floats, validation, overflow checks, and error handling.
Tests
backend/.../application/AudioAnalysisUseCaseTest.java, backend/.../infrastructure/utils/FeatureBinarySerializerTest.java, backend/.../infrastructure/converter/FfmpegAudioConverterTest.java
UseCase tests switched to UUID-based keys, added mocks and ordering/status verifications and invalid-UUID test; added comprehensive serializer tests; trivial comment change in ffmpeg test.

Sequence Diagram

sequenceDiagram
    participant Client
    participant UseCase as AudioAnalysisUseCase
    participant Storage
    participant Extractor as FeatureExtractor
    participant RefRepo as ReferenceRepository
    participant ResRepo as ResultRepository
    participant Evaluator as PerformanceEvaluator

    Client->>UseCase: execute(bucket, fileKey)
    activate UseCase
    Note over UseCase: extract UUID from fileKey

    alt references
        UseCase->>RefRepo: updateStatus(UUID, ANALYZING, null)
        UseCase->>Storage: downloadAudio(fileKey)
        UseCase->>Extractor: extractFeatures(audio)
        UseCase->>RefRepo: save(UUID, features)
    else submissions
        UseCase->>ResRepo: updateStatus(UUID, ANALYZING, null)
        UseCase->>Storage: downloadAudio(fileKey)
        UseCase->>Extractor: extractFeatures(audio)
        UseCase->>ResRepo: findAssignmentIdBySubmissionId(UUID)
        ResRepo-->>UseCase: assignmentId
        UseCase->>RefRepo: findFeaturesById(assignmentId)
        RefRepo-->>UseCase: referenceFeatures
        UseCase->>Evaluator: evaluate(referenceFeatures, submissionFeatures)
        Evaluator-->>UseCase: AnalysisResult
        UseCase->>ResRepo: save(UUID, AnalysisResult)
        UseCase->>UseCase: generate debug heatmap (errors logged)
    end

    alt success
        UseCase->>RefRepo: updateStatus(targetUUID, COMPLETED, null)
        UseCase->>ResRepo: updateStatus(targetUUID, COMPLETED, null)
    else failure
        UseCase->>RefRepo: updateStatus(targetUUID, FAILED, errorMsg)
        UseCase->>ResRepo: updateStatus(targetUUID, FAILED, errorMsg)
        UseCase-->>Client: throw RuntimeException
    end
    deactivate UseCase
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs

Suggested reviewers

  • ParovozikThomas
  • men229

Poem

🐰 I hop on UUIDs through the night,

I mark them ANALYZING, then DONE in light,
I stitch frames into binary thread,
Scores and heatmaps hum ahead,
A happy rabbit logs each byte.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 34.21% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: implementing a persistence layer with binary serialization and state management, which aligns with the changeset's focus on infrastructure (JPA entities, binary serializer, adapters) and application-level state tracking.
Description check ✅ Passed The description is directly related to the changeset. It explains the goal (persistence layer for storage), details infrastructure work (FeatureBinarySerializer, JPA entities, JSONB support, adapters) and application work (state machine, error handling, UUID mapping), all of which match the actual changes in the PR.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch satlykovs/feat/analyzer-persistence

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 14

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`:
- Around line 68-72: The catch block in AudioAnalysisUseCase uses e.getMessage()
which can be null; create a non-null error string (e.g., String errorMsg =
e.getMessage() != null ? e.getMessage() : e.toString()) and use that variable in
the log.error call, in updateStatus(bucket, entityId,
AudioProcessingStatus.FAILED, errorMsg), and when constructing the
RuntimeException to ensure diagnostic information is preserved; update the catch
block around the existing log.error/updateStatus/throw to reference this
errorMsg instead of directly calling e.getMessage().
- Line 107: The call to debugVisualizer.generateHeatmap(result, "debug_" +
submissionId + ".png") in AudioAnalysisUseCase can throw after
resultRepository.save(...) and cause the submission to be marked FAILED; wrap
the heatmap generation in a try-catch so any exception is caught and logged
(e.g., via the class logger) and not rethrown, ensuring the saved analysis
result and submission status are not changed by heatmap failures; keep
resultRepository.save(...) as the canonical success path and make heatmap
generation non-fatal (or run it asynchronously) so failures only produce a log
entry.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ReferenceRepository.java`:
- Around line 25-27: Change the port API so findById returns
Optional<FeatureSequence> instead of possibly-null FeatureSequence and
standardize the UUID parameter name to assignmentId to make the intent explicit;
update the method signature of findById(UUID assignmentId) ->
Optional<FeatureSequence> findById(UUID assignmentId) and rename the UUID
parameter in updateStatus(UUID id, AudioProcessingStatus status, String
errorMessage) to updateStatus(UUID assignmentId, AudioProcessingStatus status,
String errorMessage), and then update all implementations and callers to handle
Optional properly and to use assignmentId consistently.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ResultRepository.java`:
- Around line 12-14: Change the lookup contract on ResultRepository: make
findAssignmentIdBySubmissionId return Optional<UUID> instead of possibly-null
UUID (so callers must explicitly handle empty) and update all implementations
and call-sites accordingly; also clarify the updateStatus parameter name from
generic id to a clear identifier (e.g., resultId or assignmentId) in the
ResultRepository.updateStatus signature and propagate that rename to
implementations and usages to avoid ID ambiguity.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/adapter/SubmissionPersistenceAdapter.java`:
- Around line 37-43: Add a read-only transaction annotation to the
findAssignmentIdBySubmissionId method in SubmissionPersistenceAdapter: annotate
the method (or class if preferred) with `@Transactional`(readOnly = true) so the
repository.findById(...) read operation can benefit from DB-level optimizations
and the intent is explicit; ensure the import for
org.springframework.transaction.annotation.Transactional is present and apply it
to the method containing findAssignmentIdBySubmissionId.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/entity/AssignmentEntity.java`:
- Around line 10-29: Add a versioned DB migration that creates/updates the
assignments table to match AssignmentEntity: add the reference_spectre_cache
column (binary) and error_message column (text/varchar) and, if the table itself
is new in this PR, include full CREATE TABLE for assignments with id (UUID) and
status (string) columns; implement this as a Flyway (V###__describe.sql) or
Liquibase changeSet to ensure deterministic rollout rather than relying on
ddl-auto. Locate the schema differences by inspecting AssignmentEntity and
create an idempotent SQL migration that ALTERs the existing assignments table to
add reference_spectre_cache (bytea/BLOB) and error_message (text) and a
migration entry in your migrations folder so CI/production runs the versioned
migration during deployment.
- Around line 24-28: The errorMessage field in AssignmentEntity is currently
mapped with `@Column` and will default to varchar(255); update the `@Column`
annotation on the errorMessage field to set columnDefinition = "text" (i.e.,
`@Column`(name = "error_message", columnDefinition = "text")) so the database
column can store longer error texts without truncation; locate the errorMessage
field in the AssignmentEntity class and modify its `@Column` accordingly.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/entity/SubmissionEntity.java`:
- Around line 23-30: SubmissionEntity's required fields (id, assignmentId,
status) lack nullability constraints which can permit invalid rows; update the
JPA annotations on the UUID id, assignmentId and the AudioProcessingStatus
status in class SubmissionEntity to declare non-nullable columns (e.g., add
`@Column`(nullable = false) to assignmentId and status and ensure the id column is
marked non-nullable via `@Column`(nullable = false) or the existing `@Id` mapping),
then run or add the corresponding schema migration to enforce NOT NULL at the
database level.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaAssignmentRepository.java`:
- Around line 14-15: Update the repository update methods to use explicit
parameter binding and return the update count: in JpaAssignmentRepository change
the updateStatus method to annotate its parameters with `@Param`("id"),
`@Param`("status") and `@Param`("error") and change the return type from void to int
so the method returns the number of rows updated; apply the identical edits to
the corresponding method in JpaSubmissionRepository (add `@Param` annotations with
the same names used in the JPQL and change the return type to int) so
missing-row updates can be detected at runtime.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java`:
- Around line 50-56: The overflow check in FeatureBinarySerializer currently
computes totalElements and checks totalElements * Float.BYTES against
Integer.MAX_VALUE but omits HEADER_SIZE; update the guard to compute
requiredBytes = HEADER_SIZE + (long)frameCount * binCount * Float.BYTES (use a
long) and compare requiredBytes against Integer.MAX_VALUE before casting to int,
then set totalBytes = (int) requiredBytes; apply the same change to the
analogous check around lines 94-100 so both places include HEADER_SIZE and use
long arithmetic to prevent overflow.
- Around line 28-29: The JavaDoc in FeatureBinarySerializer currently documents
`@throws` IllegalArgumentException for null frames but the implementation throws
NullPointerException; update the JavaDoc to match the code by replacing the
`@throws` IllegalArgumentException clause with `@throws` NullPointerException (or
alternately change the method implementation to throw IllegalArgumentException
consistently), ensuring the javadoc for the method in FeatureBinarySerializer
that describes the frames parameter reflects the actual exception type thrown.

In
`@backend/smartjam-analyzer/src/test/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCaseTest.java`:
- Around line 58-83: Add a new test that covers the "submissions" flow (e.g.,
shouldProcessStudentSubmission) because shouldProcessInOrder only covers
"references"; in that test mock workspaceFactory.create(),
storage.downloadAudioFile(...), converter.convertToStandardWav(...),
featureExtractor.extract(...) to return student Features, mock
resultRepository.findAssignmentIdBySubmissionId(VALID_UUID) to return an
assignmentId and referenceRepository.findById(assignmentId) to return teacher
Features, mock performanceEvaluator.evaluate(teacherFeatures, studentFeatures)
to return an AnalysisResult, run useCase.execute(bucket, fileKey) with
bucket="submissions", then verify resultRepository.updateStatus(VALID_UUID,
AudioProcessingStatus.ANALYZING, null), resultRepository.save(VALID_UUID,
mockResult), and debugVisualizer.generateHeatmap(eq(mockResult),
contains("debug_")) to ensure the submission path invokes
findAssignmentIdBySubmissionId, findById, evaluate and generateHeatmap.

In
`@backend/smartjam-analyzer/src/test/java/com/smartjam/smartjamanalyzer/infrastructure/converter/FfmpegAudioConverterTest.java`:
- Line 20: Convert the inline TODO in FfmpegAudioConverterTest into a tracked
backlog item and/or an explicit pending test: create a ticket in your issue
tracker describing the missing "filter passthrough" test and reference that
ticket ID in the test comment, or mark the test as pending/disabled (e.g., with
JUnit's `@Disabled`) and include the ticket ID and a short description; ensure the
TODO text in the FfmpegAudioConverterTest class is replaced with the ticket
reference and a one-line note that the filter propagation test will be
implemented against the ticket so it won't be silently forgotten.

In
`@backend/smartjam-analyzer/src/test/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializerTest.java`:
- Around line 50-55: The test method deserializeTooShortReturnsEmpty currently
asserts that FeatureBinarySerializer.deserialize(shortData) returns null, but
its `@DisplayName` says "returns empty list"; update the `@DisplayName` to match the
assertion (e.g., "Десериализация слишком короткого массива возвращает null") so
the description aligns with the actual assertion and intent of the test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 383d4815-0da5-45c3-abbf-85e361c6d693

📥 Commits

Reviewing files that changed from the base of the PR and between 3a88d07 and fa52afe.

📒 Files selected for processing (13)
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ReferenceRepository.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ResultRepository.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/adapter/AssignmentPersistenceAdapter.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/adapter/SubmissionPersistenceAdapter.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/entity/AssignmentEntity.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/entity/SubmissionEntity.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaAssignmentRepository.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaSubmissionRepository.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java
  • backend/smartjam-analyzer/src/test/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCaseTest.java
  • backend/smartjam-analyzer/src/test/java/com/smartjam/smartjamanalyzer/infrastructure/converter/FfmpegAudioConverterTest.java
  • backend/smartjam-analyzer/src/test/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializerTest.java

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (4)
backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ResultRepository.java (1)

22-29: 🛠️ Refactor suggestion | 🟠 Major

Use an explicit “not found” contract for assignment lookup.

UUID findAssignmentIdBySubmissionId(...) still encodes absence via exception side-effects. Prefer Optional<UUID> so callers handle the missing-link path explicitly.

Proposed API adjustment
+import java.util.Optional;
@@
-    UUID findAssignmentIdBySubmissionId(UUID submissionId);
+    Optional<UUID> findAssignmentIdBySubmissionId(UUID submissionId);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ResultRepository.java`
around lines 22 - 29, The method ResultRepository.findAssignmentIdBySubmissionId
currently returns UUID and relies on throwing at runtime to signal "not found";
change its signature to return java.util.Optional<UUID> (i.e., Optional<UUID>
findAssignmentIdBySubmissionId(UUID submissionId)), update the Javadoc to
describe Optional usage instead of throwing, and adjust all implementing classes
and callers to handle Optional.empty() instead of catching an exception so the
absence case is explicit.
backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaAssignmentRepository.java (1)

17-19: ⚠️ Potential issue | 🟠 Major

Harden JPQL update binding and expose affected-row count.

Named parameters are used without explicit @Param, and void hides missing-row updates. Make binding explicit and return int.

Proposed fix
 import org.springframework.data.jpa.repository.JpaRepository;
 import org.springframework.data.jpa.repository.Modifying;
 import org.springframework.data.jpa.repository.Query;
+import org.springframework.data.repository.query.Param;
@@
-    void updateStatus(UUID id, AudioProcessingStatus status, String error);
+    int updateStatus(
+            `@Param`("id") UUID id,
+            `@Param`("status") AudioProcessingStatus status,
+            `@Param`("error") String error);
#!/bin/bash
set -euo pipefail

echo "== Compiler parameter-name retention =="
fd -a 'pom.xml|build.gradle|build.gradle.kts' | xargs rg -n "maven-compiler-plugin|-parameters|parameters|sourceCompatibility|targetCompatibility|--release"

echo
echo "== Named-parameter JPQL methods missing `@Param` =="
rg -nP --type=java '@Query\("(?s).*:[a-zA-Z_]\w*.*"\)\s*\R\s*(?:void|int)\s+\w+\((?:(?!@Param).)*\)\s*;'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaAssignmentRepository.java`
around lines 17 - 19, The JPQL update method updateStatus in
JpaAssignmentRepository uses named parameters without `@Param` bindings and
returns void, hiding whether rows were affected; change the method signature to
return int and add explicit `@Param` annotations for id, status and error (while
keeping `@Modifying`(clearAutomatically = true) and the existing `@Query`) so Spring
can bind the named parameters correctly and the caller receives the affected-row
count.
backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java (1)

95-101: ⚠️ Potential issue | 🟠 Major

Fix deserialization byte-size overflow guard (header still excluded).

The bound check can still pass values where HEADER_SIZE + payload overflows int, producing a wrong expectedBytes and invalid length validation.

Proposed fix
-        long totalElements = (long) frameCount * binCount;
-        if (totalElements > Integer.MAX_VALUE / Float.BYTES) {
-            throw new IllegalArgumentException("Too many elements to deserialize: " + totalElements);
-        }
-
-        int expectedBytes = HEADER_SIZE + frameCount * binCount * Float.BYTES;
+        long totalElements = (long) frameCount * binCount;
+        long requiredBytes = HEADER_SIZE + totalElements * Float.BYTES;
+        if (requiredBytes > Integer.MAX_VALUE) {
+            throw new IllegalArgumentException("Too many elements to deserialize: " + totalElements);
+        }
+
+        int expectedBytes = (int) requiredBytes;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java`
around lines 95 - 101, The current overflow guard in FeatureBinarySerializer
computes totalElements and checks totalElements * Float.BYTES against
Integer.MAX_VALUE but still allows HEADER_SIZE + payload to overflow when
computing expectedBytes; update the check to compute payloadBytes as a long
(payloadBytes = totalElements * (long) Float.BYTES), then verify that
HEADER_SIZE + payloadBytes <= Integer.MAX_VALUE before casting to int, and use
the long sum to set expectedBytes safely (int expectedBytes = (int)(HEADER_SIZE
+ payloadBytes)); ensure you reference the existing variables totalElements,
frameCount, binCount, HEADER_SIZE, payloadBytes and expectedBytes in the fix.
backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java (1)

68-75: ⚠️ Potential issue | 🟡 Minor

Persist errorMsg instead of re-reading e.getMessage().

Line 70 already normalizes null exception messages, but Line 74 still writes the raw message into the DB. That leaves FAILED records without an error text for exceptions whose message is null.

🛠️ Proposed fix
-            updateStatus(bucket, entityId, AudioProcessingStatus.FAILED, e.getMessage());
+            updateStatus(bucket, entityId, AudioProcessingStatus.FAILED, errorMsg);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`
around lines 68 - 75, In AudioAnalysisUseCase's catch block, you normalize the
exception message into the local variable errorMsg but then pass e.getMessage()
to updateStatus; change the updateStatus call to pass errorMsg (not
e.getMessage()) so FAILED records persist the normalized message (reference:
errorMsg variable, updateStatus method, catch(Exception e) in
AudioAnalysisUseCase).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`:
- Around line 30-38: In execute (AudioAnalysisUseCase.execute) validate the
incoming bucket string up front against the allowed set ("references",
"submissions"); if it's not one of those, log/mark an error and return before
creating a Workspace or calling updateStatus so we never set ANALYZING for
unknown buckets. Ensure the subsequent updateStatus(...) and downstream saves
only follow the same two-value contract (use referenceRepository for
"references" and resultRepository for "submissions") and do not attempt any save
or status transitions for invalid buckets.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ReferenceRepository.java`:
- Around line 20-35: Update the JavaDoc on the ReferenceRepository port to match
the API: for findFeaturesById(UUID assignmentId) state that it returns an
Optional<FeatureSequence> which will be empty if no features are found (do not
mention null), and clarify that the parameter assignmentId refers to the
assignment's unique identifier (not a submission); for updateStatus(UUID
assignmentId, AudioProcessingStatus status, String errorMessage) correct the
param text to say assignmentId is the assignment identifier and keep
errorMessage as optional failure description; ensure the return and param
descriptions reference FeatureSequence, Optional, findFeaturesById,
updateStatus, AudioProcessingStatus, and assignmentId consistently.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/entity/SubmissionEntity.java`:
- Around line 45-46: The errorMessage field in SubmissionEntity should use the
same TEXT column definition as AssignmentEntity to avoid VARCHAR truncation;
update the `@Column` annotation on the errorMessage field in SubmissionEntity (the
private String errorMessage) to include columnDefinition = "TEXT" (e.g.,
`@Column`(name = "error_message", columnDefinition = "TEXT") ) so long
FFmpeg/extractor messages are not truncated.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaSubmissionRepository.java`:
- Around line 24-26: Update JpaSubmissionRepository.updateStatus to return an
int and use explicit `@Param` bindings: annotate the method parameters with
`@Param`("id"), `@Param`("status") and `@Param`("error") so the named parameters in
the JPQL ("UPDATE SubmissionEntity s SET s.status = :status, s.errorMessage =
:error WHERE s.id = :id") are unambiguous, and change the return type from void
to int so callers can check the number of rows affected (0 means no update).
Keep the `@Modifying` annotation and ensure parameter names exactly match the JPQL
names (status, error, id) and the AudioProcessingStatus type for the status
parameter.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java`:
- Around line 29-37: Update the JavaDoc for FeatureBinarySerializer.serialize to
match the implementation: state that if the sequence has no frames the method
returns an empty byte[] (instead of throwing IllegalArgumentException); keep the
documented IllegalArgumentException conditions for null sequence
(Objects.requireNonNull) and for frames with differing lengths if that check
exists elsewhere, and reference FeatureSequence.frames() in the doc so callers
know empty-frame input yields an empty byte[] result.

---

Duplicate comments:
In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`:
- Around line 68-75: In AudioAnalysisUseCase's catch block, you normalize the
exception message into the local variable errorMsg but then pass e.getMessage()
to updateStatus; change the updateStatus call to pass errorMsg (not
e.getMessage()) so FAILED records persist the normalized message (reference:
errorMsg variable, updateStatus method, catch(Exception e) in
AudioAnalysisUseCase).

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ResultRepository.java`:
- Around line 22-29: The method ResultRepository.findAssignmentIdBySubmissionId
currently returns UUID and relies on throwing at runtime to signal "not found";
change its signature to return java.util.Optional<UUID> (i.e., Optional<UUID>
findAssignmentIdBySubmissionId(UUID submissionId)), update the Javadoc to
describe Optional usage instead of throwing, and adjust all implementing classes
and callers to handle Optional.empty() instead of catching an exception so the
absence case is explicit.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaAssignmentRepository.java`:
- Around line 17-19: The JPQL update method updateStatus in
JpaAssignmentRepository uses named parameters without `@Param` bindings and
returns void, hiding whether rows were affected; change the method signature to
return int and add explicit `@Param` annotations for id, status and error (while
keeping `@Modifying`(clearAutomatically = true) and the existing `@Query`) so Spring
can bind the named parameters correctly and the caller receives the affected-row
count.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java`:
- Around line 95-101: The current overflow guard in FeatureBinarySerializer
computes totalElements and checks totalElements * Float.BYTES against
Integer.MAX_VALUE but still allows HEADER_SIZE + payload to overflow when
computing expectedBytes; update the check to compute payloadBytes as a long
(payloadBytes = totalElements * (long) Float.BYTES), then verify that
HEADER_SIZE + payloadBytes <= Integer.MAX_VALUE before casting to int, and use
the long sum to set expectedBytes safely (int expectedBytes = (int)(HEADER_SIZE
+ payloadBytes)); ensure you reference the existing variables totalElements,
frameCount, binCount, HEADER_SIZE, payloadBytes and expectedBytes in the fix.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f3eab2ae-436d-4ff5-85c9-f6bc8fcb3510

📥 Commits

Reviewing files that changed from the base of the PR and between fa52afe and a55720a.

📒 Files selected for processing (11)
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ReferenceRepository.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ResultRepository.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/adapter/AssignmentPersistenceAdapter.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/adapter/SubmissionPersistenceAdapter.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/entity/AssignmentEntity.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/entity/SubmissionEntity.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaAssignmentRepository.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaSubmissionRepository.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java
  • backend/smartjam-analyzer/src/test/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializerTest.java

Comment on lines 30 to +38
public void execute(String bucket, String fileKey) {
try (Workspace workspace = workspaceFactory.create()) {
UUID entityId = extractUuid(fileKey);

// TODO: Добавить обработку(проверку типа) входящего файла

// TODO: Добавить нормальный сбор метрик

// TODO: Добавить обработку(проверку типа) входящего файла
try (Workspace workspace = workspaceFactory.create()) {
updateStatus(bucket, entityId, AudioProcessingStatus.ANALYZING, null);
Copy link

@coderabbitai coderabbitai bot Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reject unknown buckets before starting the pipeline.

Lines 58-63 only handle "references" and "submissions", but Lines 80-86 route every other bucket to resultRepository. An unexpected bucket can still be marked ANALYZING, skip both save paths, and return without ever reaching COMPLETED. Validate the bucket once up front and keep updateStatus(...) aligned with that same two-value contract.

🛠️ Proposed fix
 public void execute(String bucket, String fileKey) {
+    if (!"references".equals(bucket) && !"submissions".equals(bucket)) {
+        throw new IllegalArgumentException("Unsupported bucket: " + bucket);
+    }
+
     UUID entityId = extractUuid(fileKey);

     // TODO: Добавить обработку(проверку типа) входящего файла
@@
     private void updateStatus(String bucket, UUID id, AudioProcessingStatus status, String error) {
         try {
             if ("references".equals(bucket)) {
                 referenceRepository.updateStatus(id, status, error);
-            } else {
+            } else if ("submissions".equals(bucket)) {
                 resultRepository.updateStatus(id, status, error);
             }
         } catch (Exception ex) {

Also applies to: 58-63, 80-86

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`
around lines 30 - 38, In execute (AudioAnalysisUseCase.execute) validate the
incoming bucket string up front against the allowed set ("references",
"submissions"); if it's not one of those, log/mark an error and return before
creating a Workspace or calling updateStatus so we never set ANALYZING for
unknown buckets. Ensure the subsequent updateStatus(...) and downstream saves
only follow the same two-value contract (use referenceRepository for
"references" and resultRepository for "submissions") and do not attempt any save
or status transitions for invalid buckets.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
git log --oneline -5
cat backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java

Length of output: 5480


@Satlykovs, thanks for the update! However, looking at the latest commit (4a130f4), the fix appears to be only partially applied. Two issues from the original comment remain:

  1. No early bucket guard in execute — an unknown bucket still reaches updateStatus(ANALYZING, null) before any validation.
  2. updateStatus still uses a bare else — any bucket that isn't "references" (including unknown ones) still routes to resultRepository.updateStatus(...).

The downstream save block (else if ("submissions"...) is already correct, but the two items above still need to be addressed:

// In execute() — add this before extractUuid():
if (!"references".equals(bucket) && !"submissions".equals(bucket)) {
    throw new IllegalArgumentException("Unsupported bucket: " + bucket);
}

// In updateStatus() — change else → else if:
} else if ("submissions".equals(bucket)) {
    resultRepository.updateStatus(id, status, error);
}

Could you check if the changes were pushed correctly?


🧠 Learnings used
Learnt from: Satlykovs
Repo: Satlykovs/SmartJam PR: 5
File: backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java:23-39
Timestamp: 2026-03-09T16:58:58.342Z
Learning: In `backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`, the `cleanWavFile` variable returned by `audioConverter.convertToStandardWav(...)` in the `execute(String bucket, String fileKey)` method is intentionally unused — it is a placeholder for downstream audio analysis logic (feature extraction / matching) that will be added in future PRs once the pipeline is complete.

Learnt from: Satlykovs
Repo: Satlykovs/SmartJam PR: 11
File: backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java:107-107
Timestamp: 2026-03-22T08:26:29.855Z
Learning: In `backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`, the `debugVisualizer.generateHeatmap(...)` call inside `handleStudentSubmission` is intentionally debug/test-only and will be disabled (e.g., via a no-op bean or feature flag) in production. Failures from this call in non-production environments are considered acceptable.

Learnt from: Satlykovs
Repo: Satlykovs/SmartJam PR: 4
File: backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/service/StorageService.java:31-33
Timestamp: 2026-03-06T21:19:19.153Z
Learning: In `backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/service/StorageService.java`, S3 object keys are planned to be UUIDs generated by the API in future PRs, which eliminates filename-length concerns for temp file creation. Until that migration, arbitrary key names are still accepted by the current code.

Comment on lines +24 to +26
@Modifying(clearAutomatically = true)
@Query("UPDATE SubmissionEntity s SET s.status = :status, s.errorMessage = :error WHERE s.id " + "= :id")
void updateStatus(UUID id, AudioProcessingStatus status, String error);
Copy link

@coderabbitai coderabbitai bot Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "JpaSubmissionRepository.java" -o -name "JpaAssignmentRepository.java"

Repository: Satlykovs/SmartJam

Length of output: 339


🏁 Script executed:

wc -l ./backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaSubmissionRepository.java ./backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaAssignmentRepository.java

Repository: Satlykovs/SmartJam

Length of output: 360


🏁 Script executed:

cat -n ./backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaSubmissionRepository.java
echo "---"
cat -n ./backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaAssignmentRepository.java

Repository: Satlykovs/SmartJam

Length of output: 2573


🏁 Script executed:

rg "updateStatus" --type java -B 2 -A 2

Repository: Satlykovs/SmartJam

Length of output: 10243


Add explicit @Param bindings and return int to verify update success.

This method has the same named-parameter binding and void return issue as JpaAssignmentRepository.updateStatus(...). Without explicit @Param annotations, parameter binding can be fragile, and the void return type prevents callers from detecting update failures (zero rows affected typically indicates the entity was not found).

Proposed fix
 import org.springframework.data.jpa.repository.JpaRepository;
 import org.springframework.data.jpa.repository.Modifying;
 import org.springframework.data.jpa.repository.Query;
+import org.springframework.data.repository.query.Param;
@@
-    void updateStatus(UUID id, AudioProcessingStatus status, String error);
+    int updateStatus(
+            `@Param`("id") UUID id,
+            `@Param`("status") AudioProcessingStatus status,
+            `@Param`("error") String error);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaSubmissionRepository.java`
around lines 24 - 26, Update JpaSubmissionRepository.updateStatus to return an
int and use explicit `@Param` bindings: annotate the method parameters with
`@Param`("id"), `@Param`("status") and `@Param`("error") so the named parameters in
the JPQL ("UPDATE SubmissionEntity s SET s.status = :status, s.errorMessage =
:error WHERE s.id = :id") are unambiguous, and change the return type from void
to int so callers can check the number of rows affected (0 means no update).
Keep the `@Modifying` annotation and ensure parameter names exactly match the JPQL
names (status, error, id) and the AudioProcessingStatus type for the status
parameter.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need, modern versions don't need it

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Satlykovs, you're right — that's a fair point. Spring Boot's spring-boot-starter-parent configures the Maven compiler plugin with <parameters>true</parameters> by default, so the -parameters flag preserves method parameter names in the class file at runtime, making Spring Data JPA able to resolve named JPQL parameters without explicit @Param annotations. I'll withdraw that part of my suggestion.

That said, the separate concern about the void return type still stands — returning int from @Modifying queries allows callers to detect when zero rows were updated (e.g., when the entity doesn't exist), which is useful for catching silent no-ops. That's entirely optional if you prefer to handle that logic elsewhere (e.g., by checking existence first), so I'll leave it to your judgment.


✏️ Learnings added
Learnt from: Satlykovs
Repo: Satlykovs/SmartJam PR: 11
File: backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/repository/JpaSubmissionRepository.java:24-26
Timestamp: 2026-03-22T11:15:06.221Z
Learning: In SmartJam (Satlykovs/SmartJam), Spring Boot's spring-boot-starter-parent configures the Maven compiler plugin with the `-parameters` flag by default. Therefore, explicit `Param` annotations are NOT required on `Query`-annotated Spring Data JPA repository methods, as long as method parameter names match the named JPQL parameter placeholders. Do not flag missing `Param` annotations in this project.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: Satlykovs
Repo: Satlykovs/SmartJam PR: 0
File: :0-0
Timestamp: 2026-03-16T23:56:57.162Z
Learning: In SmartJam (Satlykovs/SmartJam), both UserRepository and RefreshTokenRepository incorrectly extend JpaRepository with Long as the ID type, but both UserEntity and RefreshTokenEntity use UUID as the primary key (GenerationType.UUID). The correct declarations are JpaRepository<UserEntity, UUID> and JpaRepository<RefreshTokenEntity, UUID>.

Learnt from: men229
Repo: Satlykovs/SmartJam PR: 2
File: backend/smartjam-api/src/main/java/com/smartjam/smartjamapi/entity/UserEntity.java:25-26
Timestamp: 2026-03-16T23:31:36.243Z
Learning: In SmartJam (Satlykovs/SmartJam), UserEntity uses `email` as the unique login identifier (unique = true in Column). The `username`/`nickname` field is a non-unique display name and does not require a uniqueness constraint. Repository methods like `existsByUsername()` / `findUserEntitiesByUsername()` may have been renamed to reflect `nickname`. Do not flag missing uniqueness on the nickname/display-name field. Applies to backend/smartjam-api/src/main/java/com/smartjam/smartjamapi/entity/UserEntity.java and related repository/service files.

@Satlykovs Satlykovs force-pushed the satlykovs/feat/analyzer-persistence branch from 0c7ea85 to 7f82b23 Compare March 22, 2026 11:16
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java (1)

30-38: ⚠️ Potential issue | 🟠 Major

Reject unsupported buckets before setting ANALYZING.

Unsupported buckets still hit updateStatus(...) via the catch-all else, skip both handleTeacherReference(...) and handleStudentSubmission(...), and return without a terminal state. Validate the bucket once up front, and keep updateStatus(...) on the same two-bucket contract.

🛠️ Minimal fix
 public void execute(String bucket, String fileKey) {
+    if (!"references".equals(bucket) && !"submissions".equals(bucket)) {
+        throw new IllegalArgumentException("Unsupported bucket: " + bucket);
+    }
+
     UUID entityId = extractUuid(fileKey);
@@
     private void updateStatus(String bucket, UUID id, AudioProcessingStatus status, String error) {
         try {
             if ("references".equals(bucket)) {
                 referenceRepository.updateStatus(id, status, error);
-            } else {
+            } else if ("submissions".equals(bucket)) {
                 resultRepository.updateStatus(id, status, error);
             }
         } catch (Exception ex) {

Also applies to: 58-63, 80-86

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`
around lines 30 - 38, In execute(...) validate the incoming bucket before
calling updateStatus(..., AudioProcessingStatus.ANALYZING, ...): check the
bucket value immediately after extractUuid(fileKey) and return with a proper
terminal status or throw if unsupported so unsupported buckets never fall
through to the catch-all else; then keep the existing updateStatus call only for
the two supported branches handled by handleTeacherReference(...) and
handleStudentSubmission(...). Locate and apply the same upfront bucket
validation logic in the other similar blocks that call updateStatus (the ones
surrounding handleTeacherReference and handleStudentSubmission) so all paths
enforce the two-bucket contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`:
- Around line 123-130: The extractUuid method currently fails on prefixed keys
and throws IllegalArgumentException on parse errors; change extractUuid to first
isolate the last path segment (e.g., take substring after the last '/' if
present), then strip the file extension (substring before last '.'), then
attempt UUID.fromString on that raw segment; on failure do NOT throw
IllegalArgumentException—either return Optional<UUID> (or null) or throw a
specific checked/handled exception (e.g., InvalidFileKeyException) so callers
(the surrounding processing flow) can record a persisted failure state in the
database instead of letting the upload be dropped silently; update all call
sites of extractUuid accordingly (or adapt to Optional) so parse failures are
handled and persisted.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ResultRepository.java`:
- Around line 23-30: The JavaDoc for
ResultRepository.findAssignmentIdBySubmissionId(UUID) incorrectly states it
throws and returns a UUID; update the doc to reflect it returns an
Optional<UUID> and does not throw on missing linkage. Specifically, change the
`@return` to indicate "Optional containing the parent assignment UUID if found, or
Optional.empty() if not" and remove the `@throws` tag; note callers like
SubmissionPersistenceAdapter.findAssignmentIdBySubmissionId(...) return
Optional.empty() and AudioAnalysisUseCase performs the orElseThrow.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java`:
- Around line 51-56: The multiplication computing totalPayloadBytes in
FeatureBinarySerializer (used in serialize and deserialize) can overflow before
the overflow check; replace the raw multiplies with checked multiplications
(e.g., use Math.multiplyExact for frameCount * binCount and then multiplyExact
by Float.BYTES) and catch ArithmeticException to throw an
IllegalArgumentException with the existing message; apply the same
checked-multiply approach to the other payload-size calculations around lines
95-100 so all paths reject oversized matrices before any cast/array allocation
and still include HEADER_SIZE in the final bounds check.
- Around line 35-36: The serializer currently returns new byte[0] for empty
FeatureSequence which deserialize treats as null; instead have
FeatureBinarySerializer.serialize(FeatureSequence) emit a valid header-sized
payload (populate the HEADER_SIZE bytes with the same format used for non-empty
sequences but with frame count = 0) so an empty sequence round-trips, and ensure
deserialize(byte[] payload) continues to interpret a header with frame count 0
as an empty FeatureSequence (not null); update both serialize(...) occurrences
(the empty-case at sequence.frames().isEmpty() and the similar block at lines
~80-83) and reference HEADER_SIZE, serialize, deserialize, and the frame count
field when implementing the fix.

---

Duplicate comments:
In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`:
- Around line 30-38: In execute(...) validate the incoming bucket before calling
updateStatus(..., AudioProcessingStatus.ANALYZING, ...): check the bucket value
immediately after extractUuid(fileKey) and return with a proper terminal status
or throw if unsupported so unsupported buckets never fall through to the
catch-all else; then keep the existing updateStatus call only for the two
supported branches handled by handleTeacherReference(...) and
handleStudentSubmission(...). Locate and apply the same upfront bucket
validation logic in the other similar blocks that call updateStatus (the ones
surrounding handleTeacherReference and handleStudentSubmission) so all paths
enforce the two-bucket contract.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d25fe1cb-9d11-4b45-b9fd-b9dba3e95290

📥 Commits

Reviewing files that changed from the base of the PR and between a55720a and 0c7ea85.

📒 Files selected for processing (4)
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ResultRepository.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/adapter/SubmissionPersistenceAdapter.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java

Comment on lines +123 to +130
private UUID extractUuid(String fileKey) {
try {
String rawUuid = fileKey.contains(".") ? fileKey.substring(0, fileKey.lastIndexOf(".")) : fileKey;

return UUID.fromString(rawUuid);
} catch (Exception e) {
throw new IllegalArgumentException("Invalid S3 file key. Expected UUID, got: " + fileKey);
}
Copy link

@coderabbitai coderabbitai bot Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
for name in StorageService.java S3StorageListener.java; do
  fd "^${name}$" backend -x sh -c '
    echo "--- $1 ---"
    sed -n "1,220p" "$1"
  ' sh {}
done

echo "--- key generation / upload sites ---"
rg -n --type java -g '!**/test/**' 'UUID\.randomUUID|putObject|fileKey|objectKey|upload.*(Audio|File)' backend/smartjam-api backend/smartjam-analyzer

Repository: Satlykovs/SmartJam

Length of output: 6019


🏁 Script executed:

#!/bin/bash
# Find StorageService in API (key generation)
fd "StorageService.java" backend/smartjam-api -x cat -n {}

# Check for UUID generation in API
echo "--- UUID generation in API ---"
rg -n --type java 'UUID\.randomUUID' backend/smartjam-api

# Find upload endpoints and key assignment logic
echo "--- Upload logic and key assignment ---"
rg -n --type java -A5 -B2 'putObject|fileKey|objectKey|@PostMapping.*upload|@PostMapping.*audio' backend/smartjam-api | head -100

Repository: Satlykovs/SmartJam

Length of output: 132


🏁 Script executed:

#!/bin/bash
# Find all modules/directories
echo "=== Repository structure ==="
ls -la backend/

echo -e "\n=== Java files in smartjam-api ==="
find backend/smartjam-api -name "*.java" -type f | head -30

echo -e "\n=== All classes with 'Upload' or 'Storage' in smartjam-api ==="
find backend/smartjam-api -name "*.java" -type f | xargs grep -l "Upload\|Storage" 2>/dev/null

echo -e "\n=== Key generation patterns across entire backend ==="
rg -n --type java 'UUID\.randomUUID|S3|putObject' backend --max-count=20

Repository: Satlykovs/SmartJam

Length of output: 8099


🏁 Script executed:

#!/bin/bash
echo "=== S3StorageListenerTest key examples ==="
cat -n backend/smartjam-analyzer/src/test/java/com/smartjam/smartjamanalyzer/api/listener/S3StorageListenerTest.java

echo -e "\n=== Check docker-compose for clues about upload source ==="
grep -n -A5 -B5 "upload\|minio\|s3" docker-compose.yaml

echo -e "\n=== Check if there's a frontend or external documentation ==="
find . -name "*.md" -o -name "*.txt" | xargs grep -l "upload\|S3 key\|fileKey" 2>/dev/null | head -5

Repository: Satlykovs/SmartJam

Length of output: 4477


🏁 Script executed:

#!/bin/bash
echo "=== AudioAnalysisUseCase tests ==="
find backend/smartjam-analyzer -name "*AudioAnalysisUseCase*Test*.java" -o -name "*Test*AudioAnalysisUseCase*.java" | xargs cat -n 2>/dev/null

echo -e "\n=== Full AudioAnalysisUseCase.execute() and extractUuid() ==="
sed -n '30,132p' backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java | cat -n

echo -e "\n=== Check if there are prefixed keys anywhere ==="
rg -n 'folder/|uploads/|audio/|prefix' backend --type java

Repository: Satlykovs/SmartJam

Length of output: 11339


UUID format enforcement happens too early, before producer-side migration to UUID keys is complete.

The extractUuid() method rejects any non-UUID object key immediately (line 31), throwing an exception before the database can record a persisted failure state. Additionally, prefixed keys like folder/<uuid>.wav are unsupported because the extension-stripping logic does not remove path segments.

Verify that the producer side (smartjam-api uploader) now guarantees all S3 object keys are bare UUIDs before this strict parsing is safe. If uploads still use arbitrary filenames or folder prefixes, this will silently drop uploads without proper error tracking.

private UUID extractUuid(String fileKey) {
    try {
        String rawUuid = fileKey.contains(".") ? fileKey.substring(0, fileKey.lastIndexOf(".")) : fileKey;
        return UUID.fromString(rawUuid);
    } catch (Exception e) {
        throw new IllegalArgumentException("Invalid S3 file key. Expected UUID, got: " + fileKey);
    }
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`
around lines 123 - 130, The extractUuid method currently fails on prefixed keys
and throws IllegalArgumentException on parse errors; change extractUuid to first
isolate the last path segment (e.g., take substring after the last '/' if
present), then strip the file extension (substring before last '.'), then
attempt UUID.fromString on that raw segment; on failure do NOT throw
IllegalArgumentException—either return Optional<UUID> (or null) or throw a
specific checked/handled exception (e.g., InvalidFileKeyException) so callers
(the surrounding processing flow) can record a persisted failure state in the
database instead of letting the upload be dropped silently; update all call
sites of extractUuid accordingly (or adapt to Optional) so parse failures are
handled and persisted.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if the key is wrong, we will be unable to find the record with that key, because it's invalid UUID

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Satlykovs, that's a fair and correct point. If the file key isn't a valid UUID, there's no entity in the database keyed to that value — so there's nothing to update the status on. The entityId itself would never be assigned, making any updateStatus(...) call impossible regardless. Throwing IllegalArgumentException early and letting the S3 listener handle it at a higher level (e.g., logging, dead-lettering) is the right behavior here.

I'll withdraw this concern — the current implementation is correct.


🧠 Learnings used
Learnt from: Satlykovs
Repo: Satlykovs/SmartJam PR: 5
File: backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java:23-39
Timestamp: 2026-03-09T16:58:58.342Z
Learning: In `backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`, the `cleanWavFile` variable returned by `audioConverter.convertToStandardWav(...)` in the `execute(String bucket, String fileKey)` method is intentionally unused — it is a placeholder for downstream audio analysis logic (feature extraction / matching) that will be added in future PRs once the pipeline is complete.

Learnt from: Satlykovs
Repo: Satlykovs/SmartJam PR: 11
File: backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java:107-107
Timestamp: 2026-03-22T08:26:29.855Z
Learning: In `backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`, the `debugVisualizer.generateHeatmap(...)` call inside `handleStudentSubmission` is intentionally debug/test-only and will be disabled (e.g., via a no-op bean or feature flag) in production. Failures from this call in non-production environments are considered acceptable.

Learnt from: Satlykovs
Repo: Satlykovs/SmartJam PR: 4
File: backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/service/StorageService.java:31-33
Timestamp: 2026-03-06T21:19:19.153Z
Learning: In `backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/service/StorageService.java`, S3 object keys are planned to be UUIDs generated by the API in future PRs, which eliminates filename-length concerns for temp file creation. Until that migration, arbitrary key names are still accepted by the current code.

Learnt from: Satlykovs
Repo: Satlykovs/SmartJam PR: 0
File: :0-0
Timestamp: 2026-03-16T23:56:57.162Z
Learning: In SmartJam (Satlykovs/SmartJam), both UserRepository and RefreshTokenRepository incorrectly extend JpaRepository with Long as the ID type, but both UserEntity and RefreshTokenEntity use UUID as the primary key (GenerationType.UUID). The correct declarations are JpaRepository<UserEntity, UUID> and JpaRepository<RefreshTokenEntity, UUID>.

Comment on lines +35 to +36
if (sequence.frames().isEmpty()) {
return new byte[0];
Copy link

@coderabbitai coderabbitai bot Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Empty feature sequences are not round-trippable.

serialize() writes new byte[0] for an empty sequence, but deserialize() maps any payload shorter than HEADER_SIZE to null. In backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/adapter/AssignmentPersistenceAdapter.java:51-57, that becomes Optional.empty(), so a persisted empty cache later looks identical to “missing” or truncated data and AudioAnalysisUseCase fails with “Teacher reference features not found”.

🛠️ One safe way to preserve empty sequences
-        if (sequence.frames().isEmpty()) {
-            return new byte[0];
-        }
-
         List<float[]> frames = sequence.frames();
         int frameCount = frames.size();
-        int binCount = frames.getFirst().length;
+        int binCount = frameCount == 0 ? 0 : frames.get(0).length;
         float frameRate = sequence.frameRate();
@@
-        if (data == null || data.length < HEADER_SIZE) {
+        if (data == null) {
             return null;
         }
+        if (data.length < HEADER_SIZE) {
+            throw new IllegalArgumentException(
+                    "Data too short. Expected at least " + HEADER_SIZE + " bytes, got " + data.length);
+        }

Also applies to: 80-83

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java`
around lines 35 - 36, The serializer currently returns new byte[0] for empty
FeatureSequence which deserialize treats as null; instead have
FeatureBinarySerializer.serialize(FeatureSequence) emit a valid header-sized
payload (populate the HEADER_SIZE bytes with the same format used for non-empty
sequences but with frame count = 0) so an empty sequence round-trips, and ensure
deserialize(byte[] payload) continues to interpret a header with frame count 0
as an empty FeatureSequence (not null); update both serialize(...) occurrences
(the empty-case at sequence.frames().isEmpty() and the similar block at lines
~80-83) and reference HEADER_SIZE, serialize, deserialize, and the frame count
field when implementing the fix.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, thats because we SERIALZE an object into bytes and DESERIALIZE bytes to an object. Am I wrong?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Satlykovs, you're conceptually right — serialize converts an object to bytes and deserialize converts those bytes back. The concern is specifically that your current implementation violates that contract for empty sequences:

  1. serialize(emptySequence) → returns new byte[0] (0 bytes)
  2. deserialize(new byte[0]) → checks data.length < HEADER_SIZE (0 < 12) → returns null

So the round-trip for an empty FeatureSequence is:

emptySequence → serialize → byte[0] → deserialize → null

null ≠ empty sequence. The identity deserialize(serialize(x)) == x breaks precisely for the empty case.

In AssignmentPersistenceAdapter, this means a stored empty cache is indistinguishable from "no data / missing" — both produce Optional.empty(), and then AudioAnalysisUseCase would throw "Teacher reference features not found".

The fix is to always write a full HEADER_SIZE payload (with frameCount = 0) even for empty sequences, so deserialize can reconstruct an empty FeatureSequence instead of returning null. In practice, empty sequences may never occur in production (a real audio file will always produce frames), but keeping serialize/deserialize as true inverses is good defensive design.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, you want not to use null, but construct an empty object or what?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Satlykovs, exactly — instead of null, return a proper empty FeatureSequence object. Here's the idea concisely:

serialize — always write the 12-byte header, even for empty sequences:

// Instead of:
if (sequence.frames().isEmpty()) {
    return new byte[0];
}

// Do this (frameCount=0, binCount=0 written into the header):
int frameCount = frames.size();               // 0
int binCount = frameCount == 0 ? 0 : frames.get(0).length;  // 0
// ... allocate HEADER_SIZE + 0 bytes and write the header normally

deserialize — when the header says frameCount == 0, return an empty sequence, not null:

if (frameCount == 0) {
    return new FeatureSequence(List.of(), frameRate);  // empty, but not null
}

So the round-trip becomes:

emptySequence → serialize → 12-byte header → deserialize → emptySequence ✅

Right now it's:

emptySequence → serialize → byte[0] → deserialize → null ❌

In practice, real audio likely never produces 0 frames, so this may never trigger in production. But it's good defensive design to keep deserialize(serialize(x)) == x true for all inputs.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in our domain, FeatureSequence CANT be without frames, it will throw an Exception

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find and display the FeatureSequence domain model
fd "FeatureSequence.java" --exec cat {}

Length of output: 1536


@Satlykovs, you're absolutely right — my concern was invalid. The FeatureSequence compact constructor enforces the invariant explicitly:

if (frames == null || frames.isEmpty()) {
    throw new IllegalArgumentException("Frames list cannot be null or empty");
}

Since FeatureSequence cannot be constructed with empty or null frames, serialize() will never receive such input. The sequence.frames().isEmpty() early-return branch is defensive dead code — the scenario I was worried about simply cannot happen at the domain level. The serialize/deserialize contract is fine as-is. I apologize for the noise!


✏️ Learnings added
Learnt from: Satlykovs
Repo: Satlykovs/SmartJam PR: 11
File: backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java:35-36
Timestamp: 2026-03-22T11:39:48.306Z
Learning: In the SmartJam analyzer (`backend/smartjam-analyzer`), the `FeatureSequence` domain model (Java record in `com.smartjam.smartjamanalyzer.domain.model`) enforces via its compact constructor that `frames` must be non-null and non-empty, throwing `IllegalArgumentException` otherwise. An empty `FeatureSequence` is therefore impossible by domain invariant, making any guard for empty frames in `FeatureBinarySerializer.serialize()` unreachable dead code.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment on lines +51 to +56
long totalPayloadBytes = (long) frameCount * binCount * Float.BYTES;
if (totalPayloadBytes > Integer.MAX_VALUE - HEADER_SIZE) {
throw new IllegalArgumentException("Matrix is too large for binary serialization");
}

int totalBytes = HEADER_SIZE + (int) totalPayloadBytes;
Copy link

@coderabbitai coderabbitai bot Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

totalPayloadBytes can still overflow before the guard runs.

(long) frameCount * binCount * Float.BYTES is still unsafe for extreme dimensions. With headers like frameCount = binCount = Integer.MAX_VALUE, the multiplication wraps, expectedBytes becomes tiny/negative, and deserialize() can fall through into enormous allocations instead of rejecting the blob.

🛠️ Use checked multiplication in both code paths
-        long totalPayloadBytes = (long) frameCount * binCount * Float.BYTES;
+        long totalElements;
+        long totalPayloadBytes;
+        try {
+            totalElements = Math.multiplyExact((long) frameCount, binCount);
+            totalPayloadBytes = Math.multiplyExact(totalElements, (long) Float.BYTES);
+        } catch (ArithmeticException ex) {
+            throw new IllegalArgumentException("Matrix is too large for binary serialization", ex);
+        }
         if (totalPayloadBytes > Integer.MAX_VALUE - HEADER_SIZE) {
             throw new IllegalArgumentException("Matrix is too large for binary serialization");
         }
@@
-        long totalPayloadBytes = (long) frameCount * binCount * Float.BYTES;
+        long totalElements;
+        long totalPayloadBytes;
+        try {
+            totalElements = Math.multiplyExact((long) frameCount, binCount);
+            totalPayloadBytes = Math.multiplyExact(totalElements, (long) Float.BYTES);
+        } catch (ArithmeticException ex) {
+            throw new IllegalArgumentException("Too many elements to deserialize", ex);
+        }
         if (totalPayloadBytes > Integer.MAX_VALUE - HEADER_SIZE) {
             throw new IllegalArgumentException("Too many elements to deserialize: " + totalPayloadBytes);
         }

Also applies to: 95-100

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java`
around lines 51 - 56, The multiplication computing totalPayloadBytes in
FeatureBinarySerializer (used in serialize and deserialize) can overflow before
the overflow check; replace the raw multiplies with checked multiplications
(e.g., use Math.multiplyExact for frameCount * binCount and then multiplyExact
by Float.BYTES) and catch ArithmeticException to throw an
IllegalArgumentException with the existing message; apply the same
checked-multiply approach to the other payload-size calculations around lines
95-100 so all paths reject oversized matrices before any cast/array allocation
and still include HEADER_SIZE in the final bounds check.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
fd "FeatureBinarySerializer.java" --exec cat {}

Length of output: 4110


@Satlykovs The fix does not appear to have been applied yet — the current code still uses the raw multiplication in both paths:

// serialize() — line ~51
long totalPayloadBytes = (long) frameCount * binCount * Float.BYTES;

// deserialize() — line ~95
long totalPayloadBytes = (long) frameCount * binCount * Float.BYTES;

The overflow risk is still present: when frameCount and binCount are both near Integer.MAX_VALUE, (long) frameCount * binCount can be up to ~4.6×10¹⁸, and multiplying by Float.BYTES (4) yields ~1.84×10¹⁹ — which exceeds Long.MAX_VALUE (~9.2×10¹⁸) and wraps to a negative value, bypassing the guard.

Here is the corrected implementation for both methods:

// In serialize()
long totalElements;
long totalPayloadBytes;
try {
    totalElements = Math.multiplyExact((long) frameCount, binCount);
    totalPayloadBytes = Math.multiplyExact(totalElements, (long) Float.BYTES);
} catch (ArithmeticException ex) {
    throw new IllegalArgumentException("Matrix is too large for binary serialization", ex);
}
if (totalPayloadBytes > Integer.MAX_VALUE - HEADER_SIZE) {
    throw new IllegalArgumentException("Matrix is too large for binary serialization");
}
// In deserialize()
long totalElements;
long totalPayloadBytes;
try {
    totalElements = Math.multiplyExact((long) frameCount, binCount);
    totalPayloadBytes = Math.multiplyExact(totalElements, (long) Float.BYTES);
} catch (ArithmeticException ex) {
    throw new IllegalArgumentException("Too many elements to deserialize", ex);
}
if (totalPayloadBytes > Integer.MAX_VALUE - HEADER_SIZE) {
    throw new IllegalArgumentException("Too many elements to deserialize: " + totalPayloadBytes);
}

🧠 Learnings used
Learnt from: Satlykovs
Repo: Satlykovs/SmartJam PR: 11
File: backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java:35-36
Timestamp: 2026-03-22T11:39:48.306Z
Learning: In the SmartJam analyzer (`backend/smartjam-analyzer`), the `FeatureSequence` domain model (Java record in `com.smartjam.smartjamanalyzer.domain.model`) enforces via its compact constructor that `frames` must be non-null and non-empty, throwing `IllegalArgumentException` otherwise. An empty `FeatureSequence` is therefore impossible by domain invariant, making any guard for empty frames in `FeatureBinarySerializer.serialize()` unreachable dead code.

@Satlykovs Satlykovs force-pushed the satlykovs/feat/analyzer-persistence branch 3 times, most recently from 035a351 to dff6548 Compare March 22, 2026 11:46
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (7)
backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java (2)

123-131: ⚠️ Potential issue | 🔴 Critical

extractUuid still doesn't handle prefixed S3 keys.

This was flagged in a previous review. The current implementation:

  • Strips file extension: folder/uuid.wavfolder/uuid
  • Fails on UUID.fromString("folder/uuid")
  • Throws IllegalArgumentException before any status can be persisted

Based on learnings, S3 keys are planned to be bare UUIDs but "arbitrary key names are still accepted by the current code." If prefixed keys arrive, they fail without any database tracking.

🛡️ Proposed fix to handle path prefixes
 private UUID extractUuid(String fileKey) {
     try {
+        // Extract filename from path (handle "folder/uuid.wav" → "uuid.wav")
+        String filename = fileKey.contains("/") 
+            ? fileKey.substring(fileKey.lastIndexOf("/") + 1) 
+            : fileKey;
+        // Strip extension
-        String rawUuid = fileKey.contains(".") ? fileKey.substring(0, fileKey.lastIndexOf(".")) : fileKey;
+        String rawUuid = filename.contains(".") 
+            ? filename.substring(0, filename.lastIndexOf(".")) 
+            : filename;
 
         return UUID.fromString(rawUuid);
     } catch (Exception e) {
         throw new IllegalArgumentException("Invalid S3 file key. Expected UUID, got: " + fileKey);
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`
around lines 123 - 131, The extractUuid method fails on prefixed S3 keys (e.g.,
"folder/uuid.wav") because it only strips extensions and passes "folder/uuid" to
UUID.fromString; update extractUuid to first strip the extension, then extract
the final path segment (substring after the last '/' or '\\' if present) to
isolate the bare UUID before calling UUID.fromString(rawUuidSegment). Keep the
existing try/catch and IllegalArgumentException but ensure the rawUuidSegment is
what you validate so prefixed keys are accepted.

30-38: ⚠️ Potential issue | 🟠 Major

Unknown bucket values can leave orphaned ANALYZING status.

This was flagged in a previous review but appears unaddressed. If bucket is neither "references" nor "submissions":

  1. updateStatus() at line 38 routes to resultRepository (via the else branch at line 84-85)
  2. Neither handleTeacherReference nor handleStudentSubmission is invoked (lines 59-63)
  3. The entity remains in ANALYZING state indefinitely

Add early validation to reject unknown buckets before any processing begins.

🛡️ Proposed fix
 public void execute(String bucket, String fileKey) {
+    if (!"references".equals(bucket) && !"submissions".equals(bucket)) {
+        throw new IllegalArgumentException("Unsupported bucket: " + bucket);
+    }
+
     UUID entityId = extractUuid(fileKey);

Also tighten the updateStatus helper:

     private void updateStatus(String bucket, UUID id, AudioProcessingStatus status, String error) {
         try {
             if ("references".equals(bucket)) {
                 referenceRepository.updateStatus(id, status, error);
-            } else {
+            } else if ("submissions".equals(bucket)) {
                 resultRepository.updateStatus(id, status, error);
             }
         } catch (Exception ex) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`
around lines 30 - 38, In AudioAnalysisUseCase.execute validate the incoming
bucket immediately after extracting the UUID and before creating the Workspace
or calling updateStatus for ANALYZING: check that bucket equals "references" or
"submissions", and if not call updateStatus(entityId,
AudioProcessingStatus.FAILED, "unknown bucket: " + bucket) (or the project's
error status) and return; this ensures unknown buckets don't leave an entity
stuck in ANALYZING and prevents skipping the proper handlers
(handleTeacherReference/handleStudentSubmission). Use the same updateStatus
helper and UUID from extractUuid so the failure is recorded against the correct
entity.
backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ReferenceRepository.java (1)

28-35: ⚠️ Potential issue | 🟡 Minor

JavaDoc @param description mismatch for updateStatus.

Line 31 documents assignmentId as "unique identifier of the submission", but this is the ReferenceRepository (for teacher assignments), not submissions. The parameter should be documented as the assignment identifier.

📝 Proposed fix
     /**
      * Performs an optimized status update for an assignment, optionally recording an error message.
      *
-     * `@param` assignmentId unique identifier of the submission.
+     * `@param` assignmentId unique identifier of the assignment.
      * `@param` status the new processing state.
      * `@param` errorMessage description of the failure, if applicable.
      */
     void updateStatus(UUID assignmentId, AudioProcessingStatus status, String errorMessage);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ReferenceRepository.java`
around lines 28 - 35, The JavaDoc for ReferenceRepository.updateStatus
incorrectly calls assignmentId a "submission" id; update the `@param`
documentation for assignmentId to say it is the assignment identifier (e.g.,
"unique identifier of the assignment") and ensure the other `@param` entries
(status, errorMessage) remain accurate; locate the updateStatus(UUID
assignmentId, AudioProcessingStatus status, String errorMessage) declaration in
ReferenceRepository and correct only the assignmentId description.
backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ResultRepository.java (1)

23-30: ⚠️ Potential issue | 🟡 Minor

JavaDoc still inconsistent with Optional<UUID> return type.

The method signature returns Optional<UUID>, but the JavaDoc at lines 27-28 still documents:

  • @return the UUID of the parent assignment (should mention Optional)
  • @throws RuntimeException if no linked assignment ID found (should be removed — the method returns Optional.empty() instead of throwing)

This was flagged in a previous review and marked as addressed, but the code still reflects the old contract.

📝 Proposed doc fix
     /**
      * Resolves the teacher's assignment ID associated with a given student submission.
      *
      * `@param` submissionId unique identifier of the submission.
-     * `@return` the UUID of the parent assignment.
-     * `@throws` RuntimeException if no linked assignment ID found.
+     * `@return` an {`@link` Optional} containing the parent assignment ID,
+     *         or {`@link` Optional#empty()} if the submission is not linked.
      */
     Optional<UUID> findAssignmentIdBySubmissionId(UUID submissionId);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ResultRepository.java`
around lines 23 - 30, The JavaDoc for findAssignmentIdBySubmissionId is
inconsistent with its Optional<UUID> return type: update the method description
to state it returns an Optional containing the parent assignment UUID (or
Optional.empty() when none exists), change the `@return` tag to mention
Optional<UUID>, and remove the incorrect `@throws` RuntimeException tag; ensure
the summary sentence and tags reflect the Optional contract for
findAssignmentIdBySubmissionId.
backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java (3)

81-83: ⚠️ Potential issue | 🟠 Major

Don’t treat non-null short payloads as “missing data”.

Line 81-83 maps malformed bytes (length < HEADER_SIZE) to null, which can silently hide corruption as an absent value. Keep null only for data == null; throw for short payload.

🛠️ Proposed fix
-        if (data == null || data.length < HEADER_SIZE) {
+        if (data == null) {
             return null;
         }
+        if (data.length < HEADER_SIZE) {
+            throw new IllegalArgumentException(
+                    "Data too short. Expected at least " + HEADER_SIZE + " bytes, got " + data.length);
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java`
around lines 81 - 83, The check in FeatureBinarySerializer that returns null
when data.length < HEADER_SIZE hides malformed payloads; change it so only data
== null returns null and when data != null but shorter than HEADER_SIZE throw a
clear exception (e.g., IllegalArgumentException with a message mentioning
HEADER_SIZE and actual length) inside the same method where the current check
exists so callers can distinguish absent input from corrupted/short payloads.

51-53: ⚠️ Potential issue | 🔴 Critical

Use checked multiplication for payload size math.

Line 51 and Line 95 still use raw multiply chains; extreme dimensions can overflow before the guard and invalidate size checks.

🛠️ Proposed fix (both serialize/deserialize paths)
-        long totalPayloadBytes = (long) frameCount * binCount * Float.BYTES;
+        long totalElements;
+        long totalPayloadBytes;
+        try {
+            totalElements = Math.multiplyExact((long) frameCount, (long) binCount);
+            totalPayloadBytes = Math.multiplyExact(totalElements, (long) Float.BYTES);
+        } catch (ArithmeticException ex) {
+            throw new IllegalArgumentException("Matrix is too large for binary serialization", ex);
+        }
         if (totalPayloadBytes > Integer.MAX_VALUE - HEADER_SIZE) {
             throw new IllegalArgumentException("Matrix is too large for binary serialization");
         }
@@
-        long totalPayloadBytes = (long) frameCount * binCount * Float.BYTES;
+        long totalElements;
+        long totalPayloadBytes;
+        try {
+            totalElements = Math.multiplyExact((long) frameCount, (long) binCount);
+            totalPayloadBytes = Math.multiplyExact(totalElements, (long) Float.BYTES);
+        } catch (ArithmeticException ex) {
+            throw new IllegalArgumentException("Too many elements to deserialize", ex);
+        }
         if (totalPayloadBytes > Integer.MAX_VALUE - HEADER_SIZE) {
             throw new IllegalArgumentException("Too many elements to deserialize: " + totalPayloadBytes);
         }
#!/bin/bash
set -euo pipefail

echo "== Locations using raw payload multiplication =="
rg -nP --type=java '\(long\)\s*frameCount\s*\*\s*binCount\s*\*\s*Float\.BYTES'

echo
echo "== Demonstrate numeric overflow envelope (Java long limit) =="
python - <<'PY'
LONG_MAX = 2**63 - 1
n = 2_147_483_647  # Integer.MAX_VALUE
payload = n * n * 4
print("payload bytes:", payload)
print("Long.MAX_VALUE:", LONG_MAX)
print("payload > Long.MAX_VALUE:", payload > LONG_MAX)
PY

Also applies to: 95-100

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java`
around lines 51 - 53, The payload-size check in FeatureBinarySerializer uses raw
chained multiplication (frameCount * binCount * Float.BYTES) which can overflow
before the guard; replace those chains in both the serialize and deserialize
paths by performing checked multiplications (e.g., use Math.multiplyExact
stepwise or equivalent checked casts) when computing totalPayloadBytes from
frameCount, binCount, and Float.BYTES, then compare the resulting long safely
against Integer.MAX_VALUE - HEADER_SIZE; update both places that reference
frameCount/binCount/Float.BYTES (the totalPayloadBytes calculation around the
serialize method and the similar calculation in the deserialize path) to use the
checked-multiply approach and preserve the existing IllegalArgumentException
behavior if the checked multiplication throws or the final size exceeds the
allowed limit.

35-37: 🧹 Nitpick | 🔵 Trivial

Remove unreachable empty-sequence branch in serialize(...).

Line 35-37 returns new byte[0], but this branch should be unreachable under your domain invariant and can hide invalid state.

♻️ Proposed refactor
-        if (sequence.frames().isEmpty()) {
-            return new byte[0];
-        }
-
         List<float[]> frames = sequence.frames();

Based on learnings: In FeatureSequence, the compact constructor enforces non-null and non-empty frames, so this empty-check branch is dead code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java`
around lines 35 - 37, Remove the unreachable empty-sequence branch in
FeatureBinarySerializer.serialize(...): delete the if
(sequence.frames().isEmpty()) { return new byte[0]; } branch and instead assume
FeatureSequence enforces non-empty frames; to fail fast if the invariant is ever
violated, replace the silent return with an explicit check that throws (e.g.,
IllegalStateException or Preconditions.checkArgument) or an
Objects.requireNonNull/assert on sequence.frames() before proceeding,
referencing the serialize(...) method and the FeatureSequence compact
constructor invariant so invalid state isn't silently swallowed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java`:
- Around line 123-131: The extractUuid method fails on prefixed S3 keys (e.g.,
"folder/uuid.wav") because it only strips extensions and passes "folder/uuid" to
UUID.fromString; update extractUuid to first strip the extension, then extract
the final path segment (substring after the last '/' or '\\' if present) to
isolate the bare UUID before calling UUID.fromString(rawUuidSegment). Keep the
existing try/catch and IllegalArgumentException but ensure the rawUuidSegment is
what you validate so prefixed keys are accepted.
- Around line 30-38: In AudioAnalysisUseCase.execute validate the incoming
bucket immediately after extracting the UUID and before creating the Workspace
or calling updateStatus for ANALYZING: check that bucket equals "references" or
"submissions", and if not call updateStatus(entityId,
AudioProcessingStatus.FAILED, "unknown bucket: " + bucket) (or the project's
error status) and return; this ensures unknown buckets don't leave an entity
stuck in ANALYZING and prevents skipping the proper handlers
(handleTeacherReference/handleStudentSubmission). Use the same updateStatus
helper and UUID from extractUuid so the failure is recorded against the correct
entity.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ReferenceRepository.java`:
- Around line 28-35: The JavaDoc for ReferenceRepository.updateStatus
incorrectly calls assignmentId a "submission" id; update the `@param`
documentation for assignmentId to say it is the assignment identifier (e.g.,
"unique identifier of the assignment") and ensure the other `@param` entries
(status, errorMessage) remain accurate; locate the updateStatus(UUID
assignmentId, AudioProcessingStatus status, String errorMessage) declaration in
ReferenceRepository and correct only the assignmentId description.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ResultRepository.java`:
- Around line 23-30: The JavaDoc for findAssignmentIdBySubmissionId is
inconsistent with its Optional<UUID> return type: update the method description
to state it returns an Optional containing the parent assignment UUID (or
Optional.empty() when none exists), change the `@return` tag to mention
Optional<UUID>, and remove the incorrect `@throws` RuntimeException tag; ensure
the summary sentence and tags reflect the Optional contract for
findAssignmentIdBySubmissionId.

In
`@backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java`:
- Around line 81-83: The check in FeatureBinarySerializer that returns null when
data.length < HEADER_SIZE hides malformed payloads; change it so only data ==
null returns null and when data != null but shorter than HEADER_SIZE throw a
clear exception (e.g., IllegalArgumentException with a message mentioning
HEADER_SIZE and actual length) inside the same method where the current check
exists so callers can distinguish absent input from corrupted/short payloads.
- Around line 51-53: The payload-size check in FeatureBinarySerializer uses raw
chained multiplication (frameCount * binCount * Float.BYTES) which can overflow
before the guard; replace those chains in both the serialize and deserialize
paths by performing checked multiplications (e.g., use Math.multiplyExact
stepwise or equivalent checked casts) when computing totalPayloadBytes from
frameCount, binCount, and Float.BYTES, then compare the resulting long safely
against Integer.MAX_VALUE - HEADER_SIZE; update both places that reference
frameCount/binCount/Float.BYTES (the totalPayloadBytes calculation around the
serialize method and the similar calculation in the deserialize path) to use the
checked-multiply approach and preserve the existing IllegalArgumentException
behavior if the checked multiplication throws or the final size exceeds the
allowed limit.
- Around line 35-37: Remove the unreachable empty-sequence branch in
FeatureBinarySerializer.serialize(...): delete the if
(sequence.frames().isEmpty()) { return new byte[0]; } branch and instead assume
FeatureSequence enforces non-empty frames; to fail fast if the invariant is ever
violated, replace the silent return with an explicit check that throws (e.g.,
IllegalStateException or Preconditions.checkArgument) or an
Objects.requireNonNull/assert on sequence.frames() before proceeding,
referencing the serialize(...) method and the FeatureSequence compact
constructor invariant so invalid state isn't silently swallowed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 417f1d26-3838-4140-98ad-e15127eec90f

📥 Commits

Reviewing files that changed from the base of the PR and between 0c7ea85 and dff6548.

📒 Files selected for processing (5)
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/application/AudioAnalysisUseCase.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ReferenceRepository.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/domain/port/ResultRepository.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/persistence/adapter/SubmissionPersistenceAdapter.java
  • backend/smartjam-analyzer/src/main/java/com/smartjam/smartjamanalyzer/infrastructure/utils/FeatureBinarySerializer.java

@Satlykovs Satlykovs force-pushed the satlykovs/feat/analyzer-persistence branch from dff6548 to 4a130f4 Compare March 22, 2026 12:04
@Satlykovs Satlykovs force-pushed the satlykovs/feat/analyzer-persistence branch from 4a130f4 to 814ee9e Compare March 22, 2026 12:05
@Satlykovs Satlykovs merged commit 1f0d2be into main Mar 22, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants